    Investigation of normalization techniques and their impact on the recognition rate in handwritten numeral recognition

    This paper presents several normalization techniques used in handwritten numeral recognition and their impact on recognition rates. Experiments with five different feature vectors based on geometric invariants, Zernike moments, and gradient features are conducted. The recognition rates obtained by combining these normalization methods with gradient features and an SVM classifier with an RBF kernel are comparable to the best state-of-the-art techniques.
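    As an illustration of what such normalization does, the sketch below centers a set of point coordinates and rescales them by a second-moment measure, making the representation invariant to position and size. The function name `moment_normalize` and its `target_size` parameter are illustrative choices, not taken from the paper, which evaluates several normalization techniques.

```python
import numpy as np

def moment_normalize(points, target_size=1.0):
    """Shift points so their centroid is at the origin, then scale so the
    RMS radius (a second-moment measure) equals target_size; the result
    no longer depends on where or how large the digit was drawn."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    rms = np.sqrt((centered ** 2).sum(axis=1).mean())
    return centered * (target_size / rms)

# The same shape drawn at two positions/scales normalizes identically.
a = moment_normalize([[0, 0], [2, 0], [2, 2], [0, 2]])
b = moment_normalize([[10, 10], [14, 10], [14, 14], [10, 14]])
```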

    A novel classifier combining supervised and unsupervised learning methods

    The Nearest Centroid Neighbor (NCN) classifier is a fast and simple supervised classification algorithm. It assumes that every class can be represented by a single cluster, whose mean (centroid) is used to decide to which class a new, unknown sample belongs. However, the assumption that each class consists of exactly one cluster limits the performance of the algorithm. This paper proposes replacing that single cluster with two or more subclusters. The new algorithm uses a well-known unsupervised method, k-means clustering, to find the centroids of these subclusters. This modification of the NCN classifier improves the results achieved on several databases.
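    A minimal sketch of the idea, assuming two sub-centroids per class and a tiny deterministic 2-means in place of standard k-means (the paper's exact clustering setup is not reproduced here):

```python
import numpy as np

def kmeans2(X, iters=10):
    """Tiny deterministic 2-means: start from the first point and the
    point farthest from it, then alternate assignment / mean updates."""
    far = X[np.argmax(((X - X[0]) ** 2).sum(axis=1))]
    centroids = np.stack([X[0], far]).astype(float)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)
        for j in range(2):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def fit_subcluster_centroids(X, y):
    """For each class, replace the single class mean by two sub-centroids."""
    return {c: kmeans2(X[y == c]) for c in np.unique(y)}

def predict(model, x):
    """Assign x to the class owning the nearest sub-centroid."""
    return min(model, key=lambda c: ((model[c] - x) ** 2).sum(axis=1).min())

# Class 0 occupies two separate blobs; class 1 sits between them, so the
# single class-0 mean (5, 0.5) would land right in class-1 territory.
X = np.array([[0, 0], [0, 1], [10, 0], [10, 1], [5, 0], [5, 1]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1])
model = fit_subcluster_centroids(X, y)
```

    With sub-centroids at roughly (0, 0.5) and (10, 0.5) for class 0, points near either blob are classified correctly, which the single-centroid rule could not achieve here.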

    Combining k-nearest neighbor and centroid neighbor classifier for fast and robust classification

    The k-NN classifier is one of the best-known and most widely used nonparametric classifiers. The k-NN rule is asymptotically optimal: its classification error converges to the Bayes error as the number of training samples approaches infinity. Many extensions of the traditional k-NN have been developed to improve classification accuracy. However, it is also well known that as the number of samples grows, k-NN can become very inefficient, because all distances from the testing sample to every sample in the training data set must be computed. In this paper, a simple method addressing this issue is proposed. Combining the k-NN classifier with the centroid neighbor classifier improves the speed of the algorithm without changing the results of the original k-NN. Moreover, using confusion matrices and excluding outliers makes the resulting algorithm much faster and more robust.
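    One plausible reading of the combination is sketched below: class centroids shortlist a few candidate classes cheaply, and full k-NN runs only over their samples. The parameters `k` and `m` are illustrative; the paper's confusion-matrix and outlier-exclusion refinements are not reproduced.

```python
import numpy as np

def knn_with_centroid_filter(X, y, query, k=3, m=2):
    """Shortlist the m classes whose centroids are closest to the query,
    then run plain k-NN only over samples of those classes, so most
    training-sample distances are never computed."""
    classes = np.unique(y)
    cents = np.stack([X[y == c].mean(axis=0) for c in classes])
    shortlist = classes[np.argsort(((cents - query) ** 2).sum(axis=1))[:m]]
    mask = np.isin(y, shortlist)
    Xs, ys = X[mask], y[mask]
    nearest = np.argsort(((Xs - query) ** 2).sum(axis=1))[:k]
    vals, counts = np.unique(ys[nearest], return_counts=True)
    return vals[np.argmax(counts)]  # majority vote among the k nearest

# Three classes as small blobs along the x axis.
X = np.array([[0, 0], [0, 1], [5, 0], [5, 1], [10, 0], [10, 1]], dtype=float)
y = np.array([0, 0, 1, 1, 2, 2])
```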

    Creating effective error correcting output codes for multiclass classification

    The error correcting output code (ECOC) technique is a general framework for solving multi-class problems using binary classifiers. The key problem in this approach is how to construct optimal ECOC codewords, i.e., codewords that maximize the recognition rate of the final classifier. Several methods described in the literature address this problem, and all of them try to maximize the minimal Hamming distance between the generated codewords. In this paper we present another approach, based both on the average Hamming distance and on the estimated misclassification error of the binary classifiers.
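    The two distance criteria, and ECOC decoding itself, can be sketched as follows; the per-classifier error estimate that the paper also uses is omitted, and the one-versus-all codebook is just a convenient example.

```python
import numpy as np

def min_and_avg_hamming(codebook):
    """Pairwise Hamming distances between ECOC codewords: classic designs
    maximize the minimum; the average is the complementary criterion."""
    C = np.asarray(codebook)
    d = [(C[i] != C[j]).sum() for i in range(len(C)) for j in range(i + 1, len(C))]
    return min(d), sum(d) / len(d)

def ecoc_decode(codebook, bits):
    """Assign the class whose codeword is nearest (in Hamming distance)
    to the observed binary-classifier outputs."""
    C = np.asarray(codebook)
    return int(np.argmin((C != np.asarray(bits)).sum(axis=1)))

# One-versus-all codes for 4 classes: every pair differs in exactly 2 bits.
codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```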

    Combining one-versus-one and one-versus-all strategies to improve multiclass SVM classifier

    The Support Vector Machine (SVM) is a binary classifier, but most problems found in real-life applications are multiclass. There are many methods of decomposing such a task into a set of smaller classification problems involving only two classes. Two widely known ones are the one-versus-one and one-versus-rest strategies. Several papers deal with these methods, improving and comparing them. In this paper, we combine these strategies to exploit their strong points and achieve better performance, where performance means both the recognition rate and the speed of the proposed algorithm. We used an SVM classifier on several different databases to test our solution. The results show that we obtain a better recognition rate on all tested databases. Moreover, the proposed method turns out to be much more efficient than the original one-versus-one strategy.
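    One conceivable way to combine the strategies is to shortlist candidate classes with cheap one-versus-all scores and run one-versus-one voting only among them, skipping most of the K(K-1)/2 pairs. The `ova_scores`/`ovo_vote` interfaces below are assumptions for illustration, not the paper's actual procedure:

```python
import numpy as np

def combined_predict(ova_scores, ovo_vote, x, shortlist=2):
    """ova_scores(x) -> array of K per-class one-vs-all scores;
    ovo_vote(x, i, j) -> winning class of the pairwise classifier.
    Only pairs drawn from the shortlist are ever evaluated."""
    cand = np.argsort(ova_scores(x))[-shortlist:]   # top-scoring classes
    votes = {int(c): 0 for c in cand}
    for a in range(len(cand)):
        for b in range(a + 1, len(cand)):
            votes[ovo_vote(x, int(cand[a]), int(cand[b]))] += 1
    return max(votes, key=votes.get)

# Toy stand-ins for the real classifiers (assumed interfaces):
ova_scores = lambda x: np.array([0.1, 0.9, 0.8])
ovo_vote = lambda x, i, j: 1 if 1 in (i, j) else min(i, j)
```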

    A hybrid discriminative/generative approach to protein fold recognition

    There are two standard approaches to the classification task: generative, which uses training data to estimate a probability model for each class, and discriminative, which tries to construct flexible decision boundaries between the classes. An ideal classifier should combine these two approaches. In this paper a classifier combining the well-known support vector machine (SVM) with the regularized discriminant analysis (RDA) classifier is presented. The hybrid classifier is applied to protein structure prediction, one of the most important goals pursued by bioinformatics. The obtained results are promising: the hybrid classifier achieves better results than the SVM or RDA classifiers alone, and a higher recognition rate than other methods described in the literature.
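    A common way to realize such a hybrid, shown here only as an assumed sketch, is to mix per-class discriminative scores with generative log-posteriors and take the argmax; the linear mixing and the weight `alpha` are illustrative choices, not the paper's combination rule.

```python
import numpy as np

def hybrid_predict(disc_scores, gen_logposts, alpha=0.5):
    """Blend per-class scores from a discriminative model (e.g. SVM
    decision values) with log-posteriors from a generative model
    (e.g. RDA), then predict the class with the highest blend."""
    s = alpha * np.asarray(disc_scores) + (1 - alpha) * np.asarray(gen_logposts)
    return int(np.argmax(s))
```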

    Using the one-versus-rest strategy with samples balancing to improve pairwise coupling classification

    The simplest classification task is to divide a set of objects into two classes, but most problems found in real-life applications are multi-class. There are many methods of decomposing such a task into a set of smaller classification problems involving only two classes. Among them, pairwise coupling, proposed by Hastie and Tibshirani (1998), is one of the best known. Its principle is to separate each pair of classes while ignoring the remaining ones. All objects are then tested against these classifiers, and a voting scheme combines the pairwise class probability estimates into a joint probability estimate over all classes. A closer look at the pairwise strategy reveals a problem that affects the final result: each binary classifier votes on every object, even when the object belongs to neither of the two classes on which the classifier was trained. Our strategy addresses this problem. We propose using additional classifiers to select the objects that will be considered by the pairwise classifiers. A similar solution was proposed by Moreira and Mayoraz (1998), but their classifiers are biased by the imbalance in the number of samples representing the classes.
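    The gating idea can be sketched as follows, with simple majority voting in place of the probability coupling and with assumed `belongs`/`pair_winner` interfaces standing in for the one-versus-rest selectors and pairwise classifiers:

```python
import numpy as np

def gated_pairwise_predict(belongs, pair_winner, x, n_classes):
    """belongs(x, c) -> bool: a one-vs-rest selector saying whether
    class c is a plausible candidate for x.  A pairwise classifier
    (i, j) votes only when at least one of its two classes is
    plausible, so classifiers trained on irrelevant pairs cannot
    pollute the vote."""
    cand = [c for c in range(n_classes) if belongs(x, c)]
    votes = np.zeros(n_classes)
    for i in range(n_classes):
        for j in range(i + 1, n_classes):
            if i in cand or j in cand:
                votes[pair_winner(x, i, j)] += 1
    return int(np.argmax(votes))

# Assumed toy gates/classifiers: the OvR selector admits only class 1,
# and every pairwise classifier involving class 1 picks class 1.
belongs = lambda x, c: c == 1
pair_winner = lambda x, i, j: 1 if 1 in (i, j) else i
```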

    A combined SVM-RDA classifier for protein fold recognition

    Predicting the three-dimensional (3D) structure of a protein is a key problem in molecular biology. It is also an interesting problem for statistical pattern recognition methods. There are many approaches to this problem using discriminative and generative classifiers. In this paper a classifier combining the well-known support vector machine (SVM) with the regularized discriminant analysis (RDA) classifier is presented and applied to a real-world data set. The obtained results are promising and improve on previously published methods.

    An improved protein fold recognition with support vector machines

    Predicting the three-dimensional structure (fold) of a protein is a key problem in molecular biology. It is also an interesting problem for statistical pattern recognition methods. In this paper a multi-class support vector machine (SVM) classifier is applied to a real-world data set. The SVM is a binary classifier, but protein fold recognition is a multi-class problem, so several new approaches to this issue are presented, including a modification of the well-known one-versus-one strategy. In that strategy, however, the number of binary classifiers that must be trained grows quickly with the number of classes. The methods proposed in this paper show how this problem can be overcome.
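    The quadratic growth mentioned above is simply the number of unordered class pairs, which a one-line helper makes concrete:

```python
def ovo_classifier_count(k):
    """One-versus-one trains one binary classifier per unordered pair
    of classes: k * (k - 1) / 2 classifiers for k classes."""
    return k * (k - 1) // 2
```

    For example, 4 classes need 6 pairwise classifiers, while 10 classes already need 45.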